
    The Application of Preconditioned Alternating Direction Method of Multipliers in Depth from Focal Stack

    Post-capture refocusing in smartphone cameras is achievable using focal stacks. However, the accuracy of this effect depends entirely on the combination of the depth layers in the stack. The accuracy of the extended depth-of-field effect in this application can be improved significantly by computing an accurate depth map, which has been an open issue for decades. To tackle this issue, this paper proposes a framework based on the Preconditioned Alternating Direction Method of Multipliers (PADMM) for depth from focal stack and synthetic defocus applications. In addition to providing high structural accuracy and occlusion handling, the optimization function of the proposed method converges faster and better than state-of-the-art methods. The evaluation was carried out on 21 sets of focal stacks, and the optimization function was compared against 5 other methods. Preliminary results indicate that the proposed method outperforms current state-of-the-art methods in terms of structural accuracy and optimization. Comment: 15 pages, 8 figures

    High-Accuracy Facial Depth Models derived from 3D Synthetic Data

    In this paper, we explore how synthetically generated 3D face models can be used to construct high-accuracy ground truth for depth. This allows us to train Convolutional Neural Networks (CNNs) to solve facial depth estimation problems. These models provide sophisticated control over image variations, including pose, illumination, facial expression and camera position. 2D training samples can be rendered from these models, typically in RGB format, together with depth information. Using synthetic facial animations, dynamic facial expression or facial action data can be rendered for a sequence of image frames, together with ground truth depth and additional metadata such as head pose and light direction. The synthetic data is used to train a CNN-based facial depth estimation system, which is validated on both synthetic and real images. Potential fields of application include 3D reconstruction, driver monitoring systems, robotic vision systems, and advanced scene understanding.

    Towards Synthetic Generation of Clinical Rosacea Images with GAN Models

    Computer-aided skin disease diagnosis has recently attracted much attention in the scientific and medical research community due to advances in computer vision and machine learning algorithms. These methodologies rely on large datasets collected from hospitals and medical professionals. Data scarcity is a major problem in the medical domain, especially for facial skin conditions, due to privacy concerns. For instance, some facial skin conditions, e.g. Rosacea, require observation of the entire face, which reveals the patient's identity. Rosacea is a lamentably neglected skin condition in the computer-aided diagnosis research community, due to the limited availability of Rosacea datasets. Hence, there is a need to explore alternative ways of dealing with the limited data available for Rosacea. A common approach to expanding small datasets is to use augmentation techniques. One of the most powerful augmentation methods in machine learning is Generative Adversarial Networks (GANs). Recently, GANs, principally the variants of StyleGAN, have successfully generated synthetic facial images. In this paper, a small dataset of 300 images of a particular skin disease, Rosacea, is used to examine the potential of a variant of StyleGAN known as StyleGAN2-ADA. The preliminary experiments and evaluations show promising signs towards addressing data scarcity in computer-aided Rosacea diagnosis.

    Skin disease analysis with limited data in particular Rosacea: a review and recommended framework

    Recently, rapid advancements in Deep Learning and Computer Vision technologies have introduced a new and exciting era in the field of skin disease analysis. However, there are certain challenges on the road to developing such technologies for real-life applications that must be investigated. This study considers one of the key challenges in data acquisition and computation, viz. data scarcity. Data scarcity is a central problem in acquiring medical images and applying machine learning techniques to train Convolutional Neural Networks for disease diagnosis. The main objective of this study is to explore possible methods for dealing with the data scarcity problem and improving diagnosis with small datasets. The challenges in data acquisition for a few lamentably neglected skin conditions, such as rosacea, make them an excellent case for exploring the possibilities of improving computer-aided skin disease diagnosis. With data scarcity in mind, the techniques explored and discussed include Generative Adversarial Networks, Meta-Learning, Few-Shot classification, and 3D face modelling. Furthermore, the existing studies are discussed based on the skin conditions considered, data volume and implementation choices. Some future research directions are recommended.

    Identifying Candidate Spaces for Advert Implantation

    Virtual advertising is an important and promising feature in the area of online advertising. It involves integrating adverts into live or recorded videos for product placements and targeted advertisements. Such integration of adverts is primarily done by video editors in the post-production stage, which is cumbersome and time-consuming. Therefore, it is important to automatically identify candidate spaces in a video frame in which new adverts can be implanted. The candidate space should match the scene perspective and also offer a high quality of experience according to human subjective judgment. In this paper, we propose the use of a bespoke neural net that can assist video editors in identifying candidate spaces. We benchmark our approach against several deep-learning architectures on a large-scale image dataset of candidate spaces of outdoor scenes. Our work is the first of its kind in this area of multimedia and augmented reality applications, and achieves the best results. Comment: Published in Proc. IEEE 7th International Conference on Computer Science and Network Technology, 201